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Abstract 

This  document  represents  the  final  report  on  the  various  scientific  activities  and  ac¬ 
complishments  relating  to  Grant  No.  FA9550-09-1-0495  over  its  period  of  performance, 
June  1,  2009  -  November  30,  2013.  The  project  had  the  following  three  overarching 
technical  objectives: 

1.  Numerical  evaluations  of  multivariate  statistical  entropies  and  information  for 
imaging  problems 

2.  bf  Information-theoretic  assessment  of  PSF  errors  and  their  impact  on  image 
quality  in  an  SOI  imaging  system 

3.  Optimal  choice  of  regularization  functional  in  image  reconstruction 

The  following  is  a  list  of  the  most  important  technical  accomplishments  of  the 
project: 

•  A  new  relation  between  two  seemingly  different  metrics  of  information  -  mutual 
information  (MI)  and  estimation  theoretic  Fisher  information  (FI)  in  the  high- 
SNR  limit,  for  arbitrary  input  and  channel  probability  distributions; 

•  New  numerically  efficient,  tight  upper  and  lower  bounds  on  successfully  transmit¬ 
ted  statistical  information,  namely  MI,  in  the  estimation  of  one  or  more  features 
of  a  target  based  on  image  data; 

•  Improved  approaches  for  an  exact  numerical,  Monte-Carlo-sampling  based  com¬ 
putation  of  MI  and  comparison  with  our  tight  upper  bounds  for  some  problems; 
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•  Bayesian  approaches  to  the  problem  of  optimal  selection  of  regularizer  and  its 
strength  based  on  data  themselves; 

•  The  development  of  a  tight,  variational  upper  bound  on  the  minimum  Bayesian 
estimation  error,  the  so  called  minimum  mean-squared  error  (MMSE),  that  is 
tighter  than  and  superior  to  several  more  widely  known  lower  bounds  on  the 
MMSE,  such  as  the  Ziv-Zakai  and  Weiss- Weinstein  bounds; 

•  An  improved  lower  bound  on  the  MMSE  that  is  tighter  than  the  standard  Ziv- 
Zakai  bound  and  is  based  on  the  minimum  probability  of  error  (MPE)  of  a  related 
M-ary  hypothesis  testing  based  Bayesian  inference; 

•  A  computational-imaging  approach  to  encoding  the  field  depth  of  a  target  using 
the  rotation  of  a  point-spread  function  based  on  the  orbital  angular  momentum 
(0AM)  states  of  light  -  this  can  be  employed  to  track,  fully  in  3D,  unresolved 
space  objects,  such  as  orbital  debris  that  could  pose  a  potential  threat  to  space 
assets;  and 

•  A  full  Bayesian  error  analyses  of  the  prospects  of  3D  localization  and  super¬ 
resolution  of  unresolved  sources  beyond  the  standard  diffraction  limit  of  an  imag¬ 
ing  system. 

These  technical  accomplishments  have  resulted  in  a  total  of  19  publications,  11  pre¬ 
sentations  (6  invited,  5  contributed),  and  1  US  patent  application. 
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1  Summary  of  Overall  Project  Accomplishments 

The  three  originally  proposed  objectives  of  the  effort  were  largely  achieved  over  the  project 
duration.  In  this  report  we  provide  only  a  summary  description  of  the  project  accomplish¬ 
ments,  leaving  the  many  technical  details  of  a  fuller  description  to  the  project-supported 
publications  that  are  cited  at  appropriate  places  below  and  are  publicly  accessible. 

An  overarching  objective  of  the  effort  was  to  construct  useful  statistical  formalisms  in 
which  one  can  analyze  the  identihcation,  characterization,  and  reconstruction  of  space  objects 
in  the  combined  spatial- spectral  domain.  The  formalisms  we  developed  during  the  project 
are  based  on  Bayesian  error  analysis  on  the  one  hand  and  statistical  information  on  the 
other,  and  exploit  new  insights  into  their  mutual  relationship  based  on  improved  versions  of 
certain  inequalities  such  as  the  Fano  bound. 

Our  technical  accomplishments  may  be  summarized  as  follows: 


1.1  A  New  Relation  between  Estimation-Theoretic  Fisher  Infor¬ 
mation  and  Mutual  Information 


An  important  task  was  to  unify  estimation  theoretic  notions  of  information  and  noise  perfor¬ 
mance  based  on  Fisher  information  (FI)  with  concepts  like  mutual  information,  compressibil¬ 
ity  and  complexity  described  by  Shannon’s  statistical  information.  FI  is  a  measure  of  local 
sensitivity  of  statistical  data  to  a  parameter  that  has  to  be  estimated  from  the  data.  As  such 
it  involves  a  statistical  average  of  a  bilinear  product  of  the  first  derivatives  of  the  logarithm 
of  the  data  probability  distribution  (PD)  with  respect  to  the  parameters  being  estimated. 
By  contrast,  mutual  information  (MI),  I(X;Y),  is  the  amount  of  statistical  information 
about  a  parameter  X  that  can  be  passed  by  a  communication  channel,  or  more  generally 
by  a  measuring  system,  through  its  statistical  output,  T,  in  the  presence  of  noise  and  other 
system  limitations.  It  is  thus  a  global  and  Bayesian  measure  of  information.  A  handful  of 
relations  are  known  between  FI  and  MI,  but  only  one  previously  known  such  relation  has 
any  physical  content,  namely  the  (negative)  difference  between  MI  and  the  statistical  en¬ 
tropy,  H{X),  of  the  input,  the  latter  representing  the  maximum  information  that  the  system 
can  transmit  under  ideal  circumstances  (i.e.,  no  noise  and  other  limitations)  depends  on  the 
FI  in  the  asymptotic  limit  of  many  repeated  measurements,  Yi,Y2, . . .  [see,  e.g.,  Kang  and 
Sompolinsky,  “Mutual  information  of  population  codes  and  distance  measures  in  probability 
space,”  Phys.  Rev.  Lett.,  86,  pp.  4958-4961  (2001)].  This  is  an  expected  result  since  in 
the  asymptotic  limit,  the  SNR  becomes  infinitely  large,  and  all  global  information  metrics 
become  local. 

The  relation  we  have  derived  is  a  different  one  that  applies  for  any  number  of  measure¬ 
ments,  Y  =  (Yi, . . .  ,Ym},  or  of  input  parameters,  X  =  (Xi, . . .  ,Xp}.  It  is  based  on  the 
observation  that  when  the  PD  of  X  is  sufficiently  narrow  the  local  and  global  measures  of 
information  are  expected  to  be  in  agreement.  That  this  is  indeed  true  is  contained  in  the 
following  expression  we  have  derived  that  is  valid  to  the  second  order  in  the  width  of  the 
X-PDF: 


/(X;Y) 


(1) 
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where  5xi  =  Xi  —  (X,)  denotes  the  linear  deviation  of  Xi  from  its  mean  value,  (Xj).  For  a 
single  input  parameter,  X,  this  reduces  to  the  following  simple  expression: 

/(X;  Y)  =  ^jdx  P{x)  {x  -  {X)fj{Y\x).  (2) 

Unfortunately,  this  result  does  not  seem  to  have  a  simple  generalization  for  arbitrary  widths 
of  the  input  PD,  for  which  global  and  local  metrics  of  information  must  in  general  be  quite 
different. 


1.2  Certain  Logorithmic  Upper  and  Lower  Bounds  on  Mutual  In¬ 
formation 


Computing  the  statistical  entropy  and  associated  MI  requires  the  evaluation  of  statistical 
averages  of  logarithms  of  PDs  and  their  ratios,  as  we  have  noted  earlier.  When  many  param¬ 
eters  and  measurements  are  involved,  these  averages  amount  to  high-dimensional  integrals 
which  require  efficient  numerical  approaches  like  Monte-Carlo  (MC)  integration  to  evalu¬ 
ate  to  high  accuracy.  However,  because  of  the  logarithm  inside  the  typical  integrand,  one 
must  sample  the  high-dimensional  configuration  space  of  these  variables  efficiently  over  vast 
regions  before  the  logarithm  is  sufficiently  attenuated  and  begins  to  contribute  negligibly 
in  order  to  yield  accurate  results.  It  is  important  therefore  to  consider  more  accurate  and 
efficient  approaches  to  MC  evaluations  of  MI. 

We  have  developed  useful  upper  and  lower  bounds  to  MI  that  altogether  avoid  integrating 
logarithms  of  PDs  and  their  ratios.  By  writing  \ii[P{y\x) / P{y)]  —  (l/a)  \ii[P{y\x) / P{y)]°'  = 
—  {l/l3)\ii[P{y\x)/P{y)]~^,  with  q;,/5  >  0,  and  using  the  convexity  of  the  logarithm  in  the 
equivalent  MI  expressions. 


I{X-,Y) 


dxdyP{x,y)  ln[P{y\x)/P{y)Y 


dxdyP{x,y)  ln[P{y\x)/ P{y)] 


(3) 


we  obtain  the  following  logarithmic  bounds  on  MI: 

■P(ylx)-'-'’ 


-iln 


dx  dy  P{x,  y) 


pw  j 


<  HX\Y)  <  -  In 

a 


dx  dy  P{x,  y) 


P{y\x) 

[  P(y)  J 


(4) 


These  constitute  a  family  of  logarithmic  bounds,  which  we  refer  to  as  a  and  fd  bounds,  which 
get  progressively  tighter  as  a  and  jd  decrease  from  positive  values  to  0. 

These  bounds  are  already  quite  tight  even  for  values  of  a,  jd  of  order  0.5  or  less,  as  we 
show  in  Figure  1  for  the  case  of  the  Poisson  channel  and  negative-exponential  statistics  for 
X.  Note  that  even  for  a,  Id  equal  to  1/2,  the  bounds  are  within  10%  of  the  exact  result. 
Indeed,  by  averaging  the  upper  and  lower  bounds  corresponding  to  the  same  value  of  a  and 
P,  we  are  likely  to  get  within  1-2%  of  the  exact  result,  obviating  thus  the  need,  in  the  exact 
MI  expression,  for  numerically  inefficient  averaging  of  logarithms  of  PDFs  that  may  become 
exceptionally  small  for  high-dimensional  spaces  of  input  and  data  variables. 


4 


w 

rs 

c 


Logarithmic  Bounds  on  Ml  Logarithmic  Bounds  on  Ml 


Figure  1:  Exact  MI  and  the  logarithmic  a  and  (3  bounds  on  it,  for  Poisson  channel  and 
negative  exponential  statistics  for  X.  The  conditional  mean  of  the  Poisson  channel,  given  x, 
is  ax  +  b,  where  a  is  the  linear  gain  factor  and  b  the  bias  of  the  channel.  The  mean  value  of 
X  is  (X). 
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1.3  Useful  Bayesian  Bounds  on  MI 

In  the  vein  of  discovering  further  bounds  that  avoid  the  numerical  inaccuracies  of  MC  based 
computations  of  MI,  such  as  Metropolis-Hastings  and  improved  versions  thereof,  we  turn  to 
the  fact  that  Bayesian  metrics  of  performance  can  be  quite  useful  in  this  respect.  The  two 
most  popular  Bayesian  metrics  are  the  MMSE  and  MPE,  both  of  which  relate  to  estimators 
that,  respectively,  are  the  mean  and  mode  of  the  posterior  PDE,  namely  P{x\y).  The 
expected  value  of  the  negative  logarithm  of  the  same  posterior,  or  the  equivocation  entropy, 
is  precisely  the  loss  of  statistical  information,  and  hence  the  fact  that  the  MI  is  related  to 
MMSE  and  MPE  is  no  serious  surprise.  It  is  this  relationship  and  its  many  consequences 
for  bounding  MI  in  terms  of  these  Bayesian  metrics  as  well  as  the  equivalence  of  MMSE  and 
MPE  to  each  other  in  the  high  SNR  limit  that  we  studied  in  detail  under  the  grant  effort. 


1.3.1  A  Generalized  Fano  Lower  Bound  on  MI  for  a  Continuous  Source 

The  usual  Fano  inequality  connecting  the  probability  of  error  in  estimating  a  source  from  its 
measurements  to  the  equivocation  entropy  has  been  an  excellent  tool  in  relating  Bayesian 
detection  and  estimation  analysis  to  statistical  information  theory.  It  is  however  based  on 
the  discreteness  of  the  sample  space,  A,  of  the  source  variable,  X,  for  which  the  following 
Fano  bound  is  valid: 


I{X-Y)  >  H{X)  -  H{P,)  -  Pelog2(|T|  -  1)  >  H{X)  -  1  -  PJog^  |T|,  (5) 


in  which  Pg  is  the  MPE  in  predicting  the  value  of  a  discrete  random  variable  X  from  data 

F. 

We  derived  a  version  of  the  Fano  bound  that  applies  to  a  continuous  source  variable.  If 
one  defines  an  e-error  variable  Eg  as 


r  1  if  |X  -X|  >  e 
to  if  |X  -X|  <  e. 


(6) 


then  the  analog  of  inequality  (??),  as  we  have  shown,  becomes 

I(X;X)  >  h(X)  -  H{P,)  -  (1  -  Pg)  log2e  -  Pg^  log(27re MMSE^),  (7) 

where  MMSE^  is  the  (Bayesian)  minimum  MSE  for  the  joint  PDF  defined  only  over  the 
excluded  region,  \X  —  X\  >  e.  The  practical  usefulness  of  this  result  is  unclear,  since  it 
involves  the  MMSE  which  even  for  the  fully  included  X,  X  region  is  usually  very  hard  to 
compute.  However,  it  may  at  the  very  least  provide  a  useful  formal  tool  to  prove  new 
theoretical  relations  between  the  Bayesian  and  information  theoretic  approaches  for  upper- 
bounding  system  performance. 


1.3.2  Other  Bayesian  Bounds  on  MI 

For  a  discrete  source  variable,  X,  the  equivocation  entropy  H{X\Y),  which  is  simply  — Elog  P{X\Y), 
is  clearly  bounded  below  by  — E log  P{x:rn»(Y)\Y),  where  m^{Y)  is  the  MAP  estimator,  namely 

m^{Y)  =  argmax^  {P{xm\Y)}  ,  (8) 
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in  which  {xi, . . . ,  xm}  is  the  sample  space  of  the  source  X.  Since  the  logarithm  is  a  concave 
function,  it  then  follows  that 

HiX\Y)  > -lnEYP{xm4Y)\Y)  =  -ln(l  -  Pi"**")),  (9) 

where  pj"**")  jg  the  MPE.  This  is  the  so-called  Feder-Merhav  (FM)  bound  derived  two  decades 
ago.  The  FM  bound  is  typically  not  sharp.  In  our  work,  we  found  a  number  of  improvements 
on  the  FM  bound  that  make  it  much  tighter,  as  we  show  for  certain  generic  problems  we 
considered. 

One  tighter  bound  follows  from  writing 


-logP(X|F) 


logmaxP(a:,„|F)  —  log 

m 


P{X\Y)/miXKP{x^\Y)  , 

m 


(10) 


and  then  calculate  its  expectation.  Performing  this  expectation  in  two  steps,  first  over  the 
posterior  and  then  over  the  data  T,  and  then  using  the  concavity  of  the  logarithm  on  the 
RHS  above,  we  may  obtain  the  following  lower  bound  on  EE: 


H{X\Y)  >  -logEymaxP(a;^|F)  +  A 

m 

=  -log(l-Pi"*'**))  + A, 

where  A  denotes  the  expression 

[max,„P(a:^|y)J 

By  means  of  the  Bayes  theorem,  P{xm\y)  =  P{y\xm,)Pm/ P{y),  we  may  express  A  as 

Pm  P^iy\Xm) 


A  =  —  log 


Yjoy- 


Pmt(y)  P{y\^mt(y)) 

with  the  corresponding  lower  bound  on  EE  via  (11)  being 

H{e\x)  >  -  log  (1  - 

Pm  P^ix\0m) 


-log 


J  dx 


Pmt{x)  Pi^\(^mt{x)) 


(11) 


(12) 


(13) 


(14) 


Subtracting  the  RHS  of  bound  (14)  from  H(X)  yields  the  corresponding  upper  bound  on 

MI. 

If  instead  of  (11),  we  write  the  exact  equality 


H{X\Y)  =  -Ey  logmsix P{xm\Y)  +  A' , 

m 

where  A'  is  the  difference 

A'  =  — E  log  1 1  — 


max^P(x,,|F)-P(A|y) 


max^  P((Cf^|T) 


(15) 


(16) 
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(17) 


and  then  use  the  inequality  —  ln(l  —  u)  >  u,  valid  for  0  <  n  <  1,  to  upper-bound  A'  as 


A'  >  (loge) 


Pm  P^jyM 

Pmt{y) 


This  bound  involves  the  same  integral  as,  but  is  looser  than,  (13).  Since  the  first  term  on 
the  RHS  of  expression  (15)  is  lower-bounded  by  —  logE^  max^  P(dm|-^),  the  EE  is  then 
lower-bounded  as 


H{e\X)  >  -log{l  - 


+  (log  e) 


1 


pIi  P‘^{y\Xm) 

Pmt(y) 


(18) 


As  before,  subtracting  the  RHS  of  bound  (18)  from  H{X)  yields  the  corresponding  upper 
bound  on  ML 

The  two  lower  bounds  on  H[X\Y),  (14)  and  (18),  the  latter  tighter  than  the  former,  cor¬ 
respond  to  fairly  tight  upper  bounds  on  MI  and  are  denoted  as  SUBl  and  SUB2,  respectively, 
in  Fig.  2.  The  looser  Fano  lower  bound  and  the  looser  FM  upper  bound  are  also  plotted 
and  indicated  as  FLB  and  FM-UB,  respectively,  on  the  same  figure  which  also  displays  the 
numerically  exact  MI.  In  this  figure,  the  fractional  MI,  namely  /(A;  Y)/ H[X),  refers  to  that 
for  the  joint  angular  position-chemical  type  feature,  A,  of  a  distant  muzzle  flash  which  is 
observed  in  a  number  of  spectral  bands  by  a  array  of  single-pixel  sensors  (“light  buckets”) 
mounted  on  a  hemispherical  surface  like  a  soldier’s  helmet  as  shown  in  Fig.  3. 

More  details  of  this  work  appear  in  a  technical  report  prepared  for  a  DARPA  contract 
which  partially  supported  the  work. 


A  Useful  Upper  Bound  for  the  Uniform  Prior  Denoting  the  difference  of  the  posterior 
from  the  prior  as  SP{X\Y)  =  P(X\Y)  —  P{X),  we  may  express  the  sum  in  Eq.  (12)  as 

J2P\^m\Y)  =  J2[Pm  +  SP{^m\Y)f 

m  m 

=  +  +  2PmSP{x^\Y)]  .  (19) 

m 

For  the  uniform  prior,  the  last  term  in  expression  (19)  vanishes,  since  the  posterior,  like  the 
prior,  is  a  probability  distribution  that  sums  to  1,  while  the  first  term  in  the  second  expression 
there  is  simply  1/M.  In  this  case,  the  sum  (19)  thus  differs  from  the  corresponding  sum  over 
the  prior  proabilities,  namely  1/M,  by  terms  that  are  of  quadratic  order  in  the  deviations 
SP{xm\Y).  This  allows  us  to  write  the  lower  bound  on  equivocation  as 

H{X\Y)  =  -ElogP(A|T) 

>  -logEP(e|y) 

=  -logEYJ2P"i^n^\Y) 

l  +  MEYj2^P"i^m\Y)  , 


=  log  M  —  log 


m 


(20) 


Fractional  Ml  for  Joint  Material-Position  Estimation 


Figure  2:  Fractional  mutual  information  (fMI)  vs.  peak  signal-to-background  ratio  (pSBR). 
Numerically  exact  results  are  shown  via  dashed  line  segments,  while  the  Fano  lower  bound 
and  the  various  upper  bounds  are  shown  via  different  marker  symbols  and  colors.  In  all 
cases,  the  error  in  calculating  the  various  results  due  to  hniteness  of  sampling  was  smaller 
than  the  size  of  the  marker  symbol  itself. 


Figure  3:  A  schematic  of  the  hemispherical  mount  for  single-pixel  multi- spectral  sensors 
arranged  regularly  in  rings  on  the  mount.  Arrows  indicate  possible  angular  directions  of 
arrival  of  a  distant  muzzle  flash. 
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so  the  MI  is  upper  bounded  as 


I{X-Y)  =  H{X)  -  H{X\Y)  <\og 


1  +  MEy 

m 


(21) 


To  obtain  the  first  inequality  in  relation  (20),  we  used  the  concavity  of  the  logarithm,  and 
then  decomposed  the  full  expectation  into  an  expectation  over  the  posterior  followed  by  an 
expectation  over  the  data  before  substituting  expression  (19)  for  the  sum.  The  MI  bound 
(21)  is  fairly  tight,  and  interpolates  between  the  log  M  value  of  MI  in  the  high-discrimination 
regime  where  only  one  of  the  posterior  probabilities  is  close  to  1  and  all  the  other  (M  —  1) 
ones  are  close  to  0,  and  the  vanishing  value  of  the  MI  in  the  low-discrimination  regime 
where  the  correction  contained  in  the  sum  on  the  RHS  in  the  upper  bound  (21)  tends  to 
be  small.  While  numerically  inefficient  to  compute,  in  general,  this  bound  has  a  very  useful 
interpretation  because  of  its  analogy  with  the  information  capacity  of  an  additive  Gaussian 
channel  for  which  the  ratio  of  the  mean  squared  deviations  of  the  posterior  from  the  prior 
and  the  uniform  squared  prior  probability,  1/M^,  serves  as  an  effective  SNR. 


1.4  Optimal  Choice  of  Regularization  Functional  in  Image  Recon¬ 
struction 

In  the  first  3.5  years  of  the  project,  we  studied  in  much  detail  the  notion  of  accommodation 
by  a  regularizer  defined  as  the  degree  of  degradation  of  reconstruction  as  the  regularization 
strength  is  increased  beyond  its  optimal  value.  This  is  a  measure  of  how  well  a  regularizer 
continues  to  “fit”  the  reconstruction  even  when  it  dominates  the  fit-to-data  term  in  the 
minimization  functional.  A  good  measure  of  the  fit  is  the  MSE,  defined  as  the  squared 
error  in  the  reconstructed  signal  relative  to  the  truth  signal  averaged  over  the  Bayesian  prior 
defined  over  the  parameters  that  define  the  object  class.  By  considering  error  in  different 
attributes  of  the  brightness  distribution  of  the  sources  in  the  family,  e.g.,  intensity  gradients, 
rather  than  intensities  themselves,  one  can  in  this  approach  rank  the  various  regularizers 
according  to  their  ability  to  accommodate  different  attributes  such  as  edges,  lines,  and 
points. 

We  found  by  means  of  extensive  simulation  the  superiority  of  Laplacian  regularizer  when 
compared  to  the  Tikhonov,  maximum-entropy  (ME),  and  total- variation  (TV)  regularizers, 
for  the  object  class  that  contains  smooth  Gaussian- shaped  sources.  Similar  studies  were 
extended  to  other  classes  of  sources  too,  those  containing  rect  functions  and  combinations 
of  Gaussian  and  rect  functions. 

Most  recently,  our  attention  was  focused  on  performing  this  analysis  without  the  use 
of  computer  simulation  so  the  reconstruction  may  be  calculated  analytically  and  the  data 
statistics  imposed  essentially  exactly  on  the  MSE  calculations.  We  have  succeeded  partially 
in  this  work  by  restricting  attention  to  quadratic  regularizers  of  form 

T^iK)  =  (22) 

Li 

where  R  is  a  real  symmetric  matrix  with  positive  eigenvalues  and  A  is  the  vector  of  signal 
intensities  at  its  pixels.  By  choosing  three  different  regularizer  matrices  R,  we  have  been 
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Comparison  of  MSE  of  the  quadratic  regularizers  in  Fourier  and  real  domain  for  PSNR  =  1 C 


(a)  (b) 

Figure  4:  (a)  The  brightness  distribution  of  a  typical  ID  signal  consisting  of  three  Gaussian 
shaped  peaks,  (b)  The  MSE  vs.  regularizer  strength  cnrves  in  a  log-log  plot  for  three 
different  quadratic  regularizers,  as  indicated  in  the  legend  on  the  plot.  The  PSNR  value  was 
fixed  at  10  for  this  plot. 


able  to  model  reconstructions  under  Tikhonov,  Laplacian,  and  curvature  regularizations,  and 
compare  the  curvatures  of  the  corresponding  logMSE  vs.  log  A  curves  at  their  minima  as 
a  measure  of  accommodation.  As  seen  below  in  Eig.  4  (b),  the  curvatnre  regnlarizer  seems 
to  do  the  best,  the  curvatnre  of  the  corresponding  MSE  curve  distinctly  being  the  smallest 
for  it.  The  class  of  signals  we  considered  for  these  plots  consists  of  three  Ganssian  shaped 
peaks  in  one  dimension  (ID),  as  shown  in  Fig.  4  (a).  We  have  derived  analytically  an  exact 
closed-form  expression  for  the  MSE,  which  we  numerically  evaluated  either  individually  for 
each  member  of  the  object  class  or  at  once  for  the  whole  class  by  statistically  averaging  the 
analytical  expression  over  the  prior  defining  the  class. 

1.5  A  Variational  Upper  Bound  on  the  MMSE  of  Bayesian  Esti¬ 
mation 

A  number  of  lower  bounds  on  the  MMSE,  including  Ziv-Zakai  (ZZB)  and  Weiss- Weinstein 
(WWB)  families  of  bounds,  have  been  proposed  and  studied  over  the  past  40  years.  These 
lower  bounds  have  reformnlated  the  problem  of  Bayesian  estimation  in  clever  ways  bnt  they 
still  have  been  difficult  to  calculate  and  apply  to  practical  problems. 

In  order  to  overcome  the  problem  of  efficient  compntability  of  the  MMSE  and  the  cor¬ 
responding  MMSE  estimator  for  general  prior  and  channel  PDFs,  we  turned  out  attention 
instead  to  simple  upper  bounds  on  the  MMSE.  We  have  derived  one  such  upper  bound  that 
seems,  in  all  cases  we  have  stndied  so  far,  to  be  easy  to  compute  and  rather  tight.  Also,  the 
celebrated  lower  bonnds,  ZZB  and  WWB,  can  be  shown  in  some  cases  to  be  qnite  loose  and 
thus  of  little  value  in  accnrately  predicting  error  performance  bounds,  particularly  for  space 
imaging  applications. 

The  minimum  value  of  the  MSE  over  the  space  of  all  possible  estimators  is  attained  by 
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the  minimum  MSE  estimator  (MMSEE),  which  is  the  posterior  mean  of  the  parameter  X, 


Xm{Y)=^x\y{X)  =  j  xP{x\Y)dx.  (23) 

The  MSE  of  this  estimator  is  the  MMSE,  8m,  which  can  be  simplified  to  the  form 

8m  =E[Xm(T)  - 

=4  -  4m  (24) 

involving  the  variances  of  the  prior  and  the  MMSEE,  namely  and 

If  X{Y)  is  another  estimator  of  X,  then  by  writing  X  —  X  =  (X  —  Xm)  +  {Xm  —  X), 
taking  its  mean  squared  value,  and  using  Eqs.  (23)  and  (24),  we  may  express  its  MSE  in  the 
expanded  form 

f  =8m  +  E[X(y)  -  XM{Y)f  +  2E[X(y)  -  Xm{Y)][Xm{Y)  -  X] 

=8M  +  E[X{Y)-XM{Y)f,  (25) 

which  is  obviously  bounded  below  by  the  MMSE,  8m-  The  second  term  in  the  second 

equality  in  (25)  is  the  amount  by  which  the  MSE  for  an  estimator  must  exceed  the  MMSE. 
By  subtracting  this  term  from  both  sides  of  the  equality  and  noting  that  this  term  cannot  be 
smaller  than  the  square  of  the  mean  E[X(F)  —  Xm{Y)],  we  have  the  following  upper  bound 
(UB)  on  the  MMSE: 

8M<8-E^[XiY)-XMiY)] 

=E[X(F)  -  -  E^[X(F)  -  X] 

=E{[X(y)  -  E(X)]  -  [X  -  E(X)]}2  =  8ub,  (26) 

where  the  second  line  follows  from  the  fact  that  the  mean  value  of  the  MMSEE  is  simply  the 
mean  value  of  the  prior,  Ex(X).  The  third  line  represents  a  convenient  way  of  combining 
the  two  terms  in  the  second  fine  and  then  regrouping  them  inside  the  square. 

The  rather  simple  upper  bound  (26)  has  the  immediate  benefit  that  it  can  be  computed 
readily  since,  unlike  the  MMSE,  it  does  not  require  any  knowledge  of  the  posterior  PD, 
Px\Y,  the  chief  bane  of  any  MMSE  calculation.  Furthermore,  since  according  to  Eq.  (25) 
the  MSE  of  any  estimator  differs  from  the  MMSE  in  the  quadratic  order  in  the  difference 
between  the  two  estimators,  a  variational  approach  to  estimate  the  MMSE  by  minimizing 
the  UB  (26)  with  respect  to  classes  of  estimators  should  yield  excellent  accuracy  even  when 
the  estimator  itself  does  not  approximate  the  MMSEE  as  well.  An  analogous  situation  exists 
for  the  variational  method  for  estimating  the  ground-state  energy  of  a  quantum  mechanical 
energy  problem.  Note  further  that  being  smaller  than  the  MSE,  which  is  the  first  term  in 
the  second  line  of  (26),  the  UB  obtained  from  any  trial  estimator  is  already  closer  to  the 
MMSE  than  its  own  MSE,  and  thus  presumably  furnishes  a  better  starting  point  and  a  faster 
convergence  to  the  MMSE. 
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1.5.1  MMSE  Upper  Bound  for  a  Polynomial  Family  of  Estimators 

Consider  first  the  simplest  problem  of  a  single  parameter  being  estimated  from  a  single 
observation  and  estimators  belonging  to  the  class  of  A^th-order  polynomials,  i.e., 

(27) 

We  shall  regard  {a„}  as  variational  parameters  that  need  to  be  chosen  to  minimize  the  UB 
(26).  Substituting  the  form  (27)  into  the  formula  (26),  we  may  express  the  UB  as 

Sub  =E 

=a^Ma  —  2a^n  +  (28) 

where  a  denotes  the  vector  (ai, . . . ,  a^v)^  {T :  matrix  transpose),  M  an  Al  x  Ai  matrix  with  el¬ 
ements,  Mmn  =  E(hF™hF”),  m,n  =  1, . . .  ,N,  and  n  an  Al  x  1  vector  of  elements  E((5X  SY"^), 
n  =  1, ...  ,N.  The  symbol  h  preceding  a  variable  denotes  its  deviation  from  its  mean  value, 

6Y^  =  Y^  (29) 

Since  the  matrix  M  is  positive  definite,  the  quadratic  problem  (28)  has  a  single  extremum 
that  is  its  absolute  minimum.  Its  location  in  the  space  of  the  parameter  vector  a  is 
determined  by  setting  the  gradient  of  the  expression  (28)  with  respect  to  the  parameter 
vector  a  equal  to  0,  namely  Ma^  =  v,  or  equivalently  a*  =  M“U.  The  corresponding 
minimum  value  of  the  UB  (28)  then  simplfies  greatly  to  the  final  form 

Sub*  =  <^x  ~  (30) 

The  minimum  value  of  the  UB  (30)  is  both  smaller  than  the  prior  variance,  cr|-,  and 
lowered  in  general  as  the  order  of  the  polynomial,  N,  increases.  To  prove  the  latter  assertion, 
we  simply  note  that  the  class  of  {N  +  l)th-order  polynomials  includes  as  a  proper  subset  the 
class  of  Aith  order  polynomials.  Thus  the  minimum  of  the  UB  over  the  former  class  cannot 
be  larger  than  the  that  over  the  latter  class.  A  more  rigorous  mathematical  proof  may  also 
be  given  based  on  Schur’s  inversion  formula  for  the  block-matrix  form  of  a  square  matrix. 

Curiously,  the  overall  additive  constant  oq  in  the  trial  form  (27)  of  the  estimator  is  left 
undetermined.  This  is  not  a  surprise  since  the  UB  form  (28)  is  clearly  insensitive  to  any 
overall  additive  constants.  Yet,  such  a  constant  is  in  general  included  in  the  actual  form 
of  the  MMSEE,  Xm,  e.g.,  for  a  Gaussian  channel  and  a  Gaussian  prior,  Xm  is  a  weighted 
sum  of  the  prior  mean,  which  is  a  constant,  and  the  data.  We  can  thus  estimate  the  form 
of  the  MMSEE  only  up  to  an  arbitrary  additive  constant  even  when  the  UB  approximates 
the  MMSE  very  well.  To  fix  this  constant,  we  need  an  additional  constraint  obeyed  by  the 
MMSEE,  namely  that  its  mean  be  the  same  as  the  prior  mean. 


N 

Xx{Y)=ao  +  Y,<^nY^- 

n=l 
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Multi-Parameter  Generalization  of  the  MMSE  and  Upper  Bound  As  is  well 
known,  the  form  (23)  of  the  MMSEE  remains  valid  for  the  most  general  definition  of  the 
MSE  for  multivariate  parameter  and  data  vectors, 

S  =  E{[X(i:)  -  XfR  [X(Y)  -  X]},  (31) 

where  R  is  a  positive-definite,  symmetric,  real  matrix.  We  need  only  interpret  (23)  as  being 
the  MMSE  estimator  for  each  parameter. 


XnM{y)  =  ExlriXn)  =  j  P{Xn\Y_)  dXn,  U  =  1,  .  .  .  ,  W,  (32) 

where  is  the  total  number  of  parameters  being  estimated.  Consequently,  the  upper-bound 
formula  (25)  undergoes  a  simple  modification  to  the  form 

where 

Sub  =  E{[(5X(F)  -  5A]^R[(5X(Y)  -  hX]}-  (33) 

1.5.2  A  MAP  Estimator  Based  MMSE  Upper  Bound 

One  approach  for  choosing  a  good  trial  estimator  is  to  use  a  weighted  sum  of  the  MAP 
estimator  and  the  mean  value  of  the  prior  and  to  pick  the  two  weights  by  requiring  that 
the  expected  value  of  the  estimator,  like  the  true  MMSE  estimator,  be  the  same  as  the 
prior  mean  and  that  the  MSE  computed  for  this  estimator  be  as  small  as  possible  (for  the 
tightest  UB).  The  two  requirements  fix  the  weights,  thus  yielding  both  an  approximate  value 
(in  the  UB  sense)  of  the  MMSE  and  a  good  approximation  to  the  true  MMSE  estimator. 
Some  results  of  this  approach  are  shown  in  Fig.  5  for  the  problem  of  estimating  the  center 
of  a  Gaussian-shaped  source  on  a  pixel  line  in  the  presence  of  Gaussian  noise.  We  have 
plotted  the  MMSE,  calculated  numerically  (with  difficulty),  and  the  UB  obtained  by  the 
approach  we  have  just  outlined.  Other  well  known  bounds,  specifically  the  Ziv-Zakai  and 
Weiss- Weinstein  bounds  that  are  lower  bounds  on  MMSE,  are  also  presented  on  the  same 
plot.  The  superiority  of  the  UB  obtained  in  our  work  is  evident  as  it  hugs  the  true  MMSE 
more  closely  than  the  two  lower  bounds  as  well  as  a  pure  MLE  based  UB,  as  shown. 

1.5.3  A  Quasilinear  Variational  Bound:  Piecewise  Linearization 

A  typical  issue  facing  the  choice  of  a  good  trial  estimator  is  the  nonlinear  dependence  of 
the  data  mean  on  the  parameter  being  estimated.  In  the  ID  localization  problem,  the 
nonlinearity  is  present  in  the  Gaussian  dependence  of  the  mean  data  vector  on  the  center 
position  of  the  true  Gaussian  signal,  denoted  by  G{9),  where  6  is  its  center  position  being 
estimated  here.  Gorrespondingly,  a  good  estimator  of  the  center  position  cannot  depend 
linearly  on  the  data,  making  the  choice  of  a  good  estimator  a  difficult  one.  As  we  have  seen, 
a  weighted  MAP  estimator  provides  a  good  choice,  but  determining  the  MAP  estimator  iself 
may  be  a  highly  daunting  task  requiring  extensive  nonlinear-optimization  based  computation. 
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1 D  source  localization 


Figure  5:  The  MMSE  and  various  lower  and  upper  bounds  on  it  for  the  problem  of  ID 
source  localization  in  the  presence  of  Gaussian  additive  noise 


A  good,  computationally  simple  alternative  to  the  MAP  approach  is  to  approximate  the 
required  estimator  piecewise,  so  it  works  well  for  a  wide  interval  of  prior  values  over  which  the 
mean  data  vector  depends  nonlinearly.  Since  over  a  small  sub-interval  of  parameter  values 
the  data  may  be  regarded  as  being  approximately  linearly  dependent  on  the  parameter,  a 
good  approximation  to  the  estimator  may  be  had  via  the  following  piecewise  linearized  form: 

Ne 

A,6)  =  Y.^k  +  ajY)  exp[-\\Y-Gm\yw%  (34) 

i=l 

where  G{9i)  is  the  vector  of  intensity  values  that  depend  on  the  pixel  index  via  the  given 
Gaussian  function  evaluated  for  the  center  position  6  being  equal  to  the  mid-point  of  the 
ith  subinterval  of  the  full  6  interval.  The  use  of  such  Gaussian  weight  enforces  in  the  above 
linear  sum  enforces  its  piecewise  linearity,  while  allowing  for  linear  variational  parameters  in 
terms  of  the  matrix  A  of  column  vectors  representing  linear  gain,  and  the  elements  of  the 
vector  b  linear  bias,  one  for  each  subintervah  In  spite  of  the  nonlinearity  posed  by  the  data- 
dependent  Gaussian  weights,  the  calculation  of  the  MSE  for  the  estimator  ()  is  analytically 
straightforward  since  its  form  is  identical  to  the  Gaussian  form  of  the  conditional  PDF  of 
the  data  vector  T,  given  9.  Indeed,  the  same  analytical  expediency  can  be  built  into  any 
specific  form  of  the  conditional  data  PDF,  as  long  as  there  is  a  one-to-one  correspondence, 
however  nonlinear,  between  the  mean  data  vector  and  the  parameter  value  and  the  weight 
factor  is  chosen  to  be  of  the  same  analytical  form  as  the  conditional  data  PDF. 
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1 D  source  localization 


Figure  6:  The  MMSE  and  various  upper  bounds  on  it  for  the  problem  of  ID  pulse-center 
localization  from  a  time  series  in  the  presence  of  Gaussian  additive  noise 


In  Fig.  6,  we  display  the  results  for  the  numerically  exact  MMSE  and  its  upper  bounds 
computed  by  variational  MAP  and  piece- wise  linearlization  approaches  covered  above,  for  the 
problem  of  estimation  of  the  center  of  a  pulse  that  is  corrupted  by  additive  Gaussian  noise. 
The  width  of  the  pulse  is  y/3  units,  while  the  prior  on  the  center  is  distributed  uniformly 
over  the  range  (30,40)  (in  the  same  units).  The  SNR  plotted  on  the  x  axis  is  simply  the 
ratio  of  the  total  pulse  signal  energy  and  the  total  noise  energy,  E/N,  with  the  noise  energy, 
N,  being  the  variance  of  the  read  noise  per  time  sub-interval,  taken  to  be  0.1  long,  times 
the  total  number  of  such  sub-intervals,  about  700  in  all,  in  the  full  time  series.  The  MMSE 
and  MAP-based  variational  bounds  were  computed  for  50  different  noise  realizations,  with 
corresponding  finite  but  sufficiently  error  bars  on  those  bounds  shown  in  the  plot.  Note  the 
excellent  accuracy  provided  by  the  piece- wise  linearized  versions  of  the  variational  estimator, 
for  A^i  =  1,  4,  and  10  subintervals  over  which  the  full  range  (30,40)  of  the  pulse  center,  6,  is 
subdivided  according  to  the  piece-wise  linearization  protocol  described  above.  As  expected, 
the  agreement  of  the  coresponding  bounds  with  the  exact  MMSE  improves  dramatically  from 
1  to  4  to  10  subdivisions  of  this  range. 

Other  problems,  including  2D  point-source  localization  using  a  imager  with  Gaussian 
PSE  and  different  noise  statistics,  such  as  Poisson  conditional  data  treated  via  a  pseudo- 
Gaussian  PDE  approximation,  have  also  been  treated  for  our  MAP-based  and  piecewise 
linearized  variational  upper  bounds.  In  each  case,  the  upper  bounds  obtained  in  these  ways 
have  been  found  to  be  tight  and  numerically  efficient  to  compute. 
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a  =  V(r-1) 

'k\  wavelength 
r:  index  of  refraction 
1=1, 2, ..,L:  zone  index 


Schematic  of  a  5-zone  element  (L=5) 
(Color-coded  for  ease  of  visualization) 


Figure  7:  A  schematic  of  the  optical  element  with  its  spiral  phase  retardation 


1.6  3D  Imaging  and  Source  Ranging  via  a  Rotating  Point-Spread 
Function 

In  a  recent  work  [3] ,  we  have  proposed  and  evaluated  a  novel  rotating  PSF  design  by  means 
of  wavefront  coding.  The  imaging  pupil  in  this  design  is  subdivided  into  Fresnel  zones,  with 
each  zone  carrying  a  radially  uniform  but  azimuthally  spiraling  phase  profile  with  an  integral 
number  of  turns  in  a  complete  circuit  around  the  pupil  center.  The  number  of  turns  of  the 
spiral  phase  is  taken  to  increment  by  1  between  two  successive  Fresnel  zones.  For  such  a 
pupil  phase  plate,  shown  in  Fig.  7,  the  PSF  merely  rotates  with  changing  defocus  of  the 
source  point,  without  spreading  significantly,  over  a  considerable  range  of  defocus  values. 
While  the  PSF  itself  is  not  as  compact  as  the  in-focus  Airy  disk  PSF  obtained  with  a  clear, 
well-corrected  pupil,  its  non-spreading  character  lends  it  a  potential  to  determine  the  depth 
distribution  of  a  collection  of  point  sources  in  a  single  snapshot.  This  increased  efficiency, 
with  little  degradation  of  sensitivity  with  defocus,  enables  it  to  provide  a  fast  image  of  a 
3D  scene  of  point  sources  without  having  to  refocus  the  imager  at  different  focal  depths  in 
a  sequential  manner.  Applications  to  space-borne  imagers  that  provide  surveillance  of  the 
space  environment  in  the  vicinity  of  a  space  asset  against  debris  and  other  potential  threats 
are  immediate  for  this  imager. 

For  such  a  phase  mask  in  the  imaging  pupil,  the  resulting  PSF  rotates  with  defocus, 
largely  maintaining  its  shape  and  size  over  a  large  range  of  values  of  the  defocus.  This  is 
illustrated  well  in  Fig.  8  where  we  make  color-coded  surface  plots  of  the  PSF  for  a  number 
of  different  values  of  the  defocus-induced  quadratic  phase, C,  at  the  edge  of  the  pupil.  The 
top  row  displays  the  PSF  rotation  for  the  spiral  pupil-phase  mask  described  here,  while  the 
bottom  row  presents  the  corresponding  conventional  clear- aperture  PSFs  that  exhibit  large 
spreading  and  loss  of  sensitivity  over  the  same  range  of  defocus  values. 

The  wavefront-coded  imager  works  in  the  trade  space  of  transverse  and  longitudinal  reso- 
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(a)  (b)  (c)  (d)  (e) 


(f)  (g)  (h)  (i)  (j) 


Figure  8:  Surface  plots  of  the  rotating  PSF,  with  L  =  7  Fresnel  zones  (top  row).  The  plots 
from  left  to  right  are  for  increasing  values  of  defocus,  namely  -24,  -8,  0,  8,  and  24  radians 
at  the  pupil  edge.  The  conventional  clear-aperture  PSF  is  shown  in  the  bottom  row  of  plots 
for  the  same  values  of  defocus. 


lutions,  and  thus  provides  a  fertile  setup  to  explore  such  tradeoffs  in  computational  imaging. 
The  generalization  of  this  concept  to  include  the  vector  nature  of  the  electromagnetic  field 
and  thus  to  provide  a  potential  polarimetric  imager  is  currently  being  investigated.  While 
this  work  was  not  a  direct  result  of  any  task  initially  proposed  under  the  current  grant,  its 
serendipitous  discovery  has  given  a  new  direction  and  impetus  to  our  work. 

1.7  Statistical  Bayesian  Analysis  of  3D  Source  Super- localization 
Using  a  Rotating-PSF  Imager 

We  have  analyzed  certain  statistical  upper  bounds  on  the  performance  of  a  rotating-PSF- 
based  imager  for  a  complete  3D  super-localization  and  super-resolution  of  point  sources  and 
compared  its  performance,  via  these  upper  bounds,  to  that  of  a  conventional  imager.  Two 
kinds  of  Bayesian  estimators,  specifically  the  mean  and  mode  of  the  posterior  probability 
density  function  (PDF),  are  adopted  for  our  calculations.  The  first  is  associated  with  the 
minimum  mean-squared  error  (MMSE)  and  the  latter  with  the  minimum  probability  of  error 
(MPE)  in  a  multi- hypothesis  testing  (MHT)  based  Bayesian  inference.  The  two  error  bounds 
provide  somewhat  different  quantitative  metrics  of  performance,  but  are  closely  related  at 
high  SNR  [2]. 

The  problem  of  localizing  a  point  source  to  sub-diffractive  uncertainties  in  3D  may  be 
phrased  in  terms  of  the  minimum  error  in  localizing  the  source  to  within  one  of  x 
M||  possible  sub-voxels  into  which  a  nominal  voxel,  corresponding  to  the  diffraction-limited 
resolution  volume,  is  subdivided  uniformly.  The  integers  M±_  and  My  represent  the  transverse 
and  axial  localization  enhancement  factors,  respectively.  The  problem  can  be  posed  either 
as  a  spatial  error  bound  problem,  described  by  the  MMSE  metric,  or  as  an  MHT  problem, 
described  by  the  MPE  metric.  The  Bayesian  prior  in  each  case  must  be  chosen  as  being 
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2D,  pseudoGaussian  noise,  16x  resolution 


Figure  9:  MMSE  vs.  FNR  for  conventional  and  rotating-PSF  based  imagers  for  2D  source 
localization,  at  different  background  levels. 


uniform  over  the  M]_  x  My  sub- voxels. 

Some  preliminary  results  [4,5]  for  the  two  approaches  under  a  combination  of  sensor  read 
noise  and  signal  and  background  based  shot  noise  sources  are  presented  in  Figs.  9-10  below. 
For  the  MMSE  based  description  of  transverse  (2D)  source  localization,  we  find  that  at 
sufficiently  high  SNR,  defined  here  as  the  source  fiux-to- noise  ratio  (ENR),  the  MMSE  is 
reduced  well  below  its  zero-SNR  value,  namely  the  prior  variance.  A  reduction  of  the  MMSE 
by  a  factor  M'j_  represents  an  M^-fold  transverse  super-localization,  with  the  associated  FNR 
providing  a  lower  bound  on  the  FNR  needed  to  achieve  it.  As  seen  in  Fig.  9,  the  conventional 
imager  at  best  focus  yields  a  rapid  decrease  in  2D-localization  MMSE  with  increasing  ENR, 
but  its  behavior  is  reversed  at  large  defocus,  e.g.,  at  C  =  16  radians  for  which  the  MMSE 
shows  little  reduction  even  at  ENR  =  40  dB.  By  contrast,  the  rotating-PSF  imager  has  a 
rather  robust  2D-locahzation  performance,  with  all  of  the  MMSE  vs.  FNR  curves  closely 
bunched,  over  any  defocus  phase  between  0  and  16  radians. 

Figure  10  captures  similar  2D-locahzation  trends  for  the  two  imagers  from  the  MPE 
perspective.  Eor  N  pixels  of  image  data,  when  N  »  1  it  is  possible  to  develop  an  asymptotic 
analysis  of  the  MPE  for  an  image-based  Bayesian  MHT  problem  that  involves  complimentary 
error  functions  [5].  The  agreement  between  the  exact  and  asymptotic  MPE  values,  as  the 
figure  shows,  is  quite  good  at  large  FNR. 

We  have  also  confirmed  a  cross-over  behavior  of  the  MPE  vs.  defocus  for  2D  localization 
between  the  two  imagers,  since  the  conventional  imager  has  a  better  performance  at  the 
plane  of  best  focus  but  rapidly  degrades  with  increasing  defocus  while  the  rotating-PSE 
imager  has  a  rather  constant  performance  across  a  large  range  of  defocus.  This  behavior  is 
consistent  with  the  fact  that  with  increasing  defocus  the  rotating  PSE  maintains  its  size  and 
shape,  but  the  conventional  PSF  spreads  rapidly  and  thus  loses  its  sensitivity  to  localize  a 
source  in  the  transverse  plane. 

Einally,  we  have  also  computed  the  exact  and  asymptotic  values  of  the  MPE  for  full  3D 
source  super-localization.  An  interesting  competition  between  transverse  and  axial  (depth) 
localization  enhancement  is  seen  here,  with  the  latter  providing  the  limiting  behavior  of 
the  MPE  in  the  limit  of  high  ENR.  Also,  the  conventional  imager  performs  poor  depth 
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Figure  10:  Plots  of  MPE  vs.  FNR  for  the  rotating-PSF  imager  for  two  values  of  defocus, 
(a)  ^  =  0;  (b)  C  =  16.  (c)  The  same  plot  for  the  conventional  imager  for  the  same  two  values 
of  (.  The  asymptotic  MPE  values  are  shown  by  dashed  line  segments  without  any  marker 
symbols. 


localization  at  the  plane  of  best  focus  where  it  has  vanishing  first  order  derivative  relative 
to  defocus  and  thus  no  first-order  sensitivity  to  depth  localization.  These  and  other  details 
of  full  3D  localization  are  contained  in  Ref.  [5]. 

1.7.1  A  Bayesian  MPE  Based  Analysis  of  2D  Point-Source- Pair  Superresolution 

In  a  second  recently  submitted  paper  [6],  a  related  problem  of  the  optical  superresolution 
(OSR)  of  a  pair  of  equal-brightness  point  sources  separated  spatially  by  a  distance  (or  angle) 
smaller  than  the  nominal  diffraction-limited  resolution  spacing  (or  angle)  has  been  analyzed 
from  the  same  Bayesian  perspective  of  the  MPE.  The  question  was  posed  as  a  Bayesian 
binary-hypothesis  testing  problem  of  discriminating  between  a  null  hypothesis  -  that  there 
is  a  single  point  source  with  double  the  brightness  of  each  point  source  in  the  pair  -  and  an 
alternative  hypthesis  -  that  there  are  indeed  two  point  sources.  This  problem  was  studied  un¬ 
der  the  same  noise  and  background  conditions  under  which  we  studied  3D  super-localization 
discussed  in  the  previous  subsection.  The  correct  hypothesis  is  chosen  with  little  error, 
determined  by  the  MPE  dropping  below  a  lower  threshold  typically  at  5%,  equivalent  to 
a  95%  statistical  confidence  level,  provided  the  source  brightness  exceeds  a  certain  upper 
threshold  that  is  determined  by  the  smallness  of  the  (sub-diffractive)  source  separation  and 
background/noise  levels.  Without  going  into  the  details  of  a  rather  complex  calculation  pre¬ 
sented  in  the  paper  [6],  which  is  based  on  some  asymptotically  valid  theoretical  expressions 
for  the  MPE  derived  in  the  earlier  paper  [5],  we  present  only  the  most  important  findings  of 
this  work. 

Eirst,  the  minimum  brightness  needed  to  achieve  pair  OSR  in  the  signal-dominated  regime 
scales  approximately  as  the  fourth  power  of  the  inverse  source-pair  spacing,  but  the  scaling  is 
modified  from  its  quartic  behavior  by  logarithmic  corrections  that  have  not  been  inferred  by 
any  of  the  previous  analyses  of  this  problem.  Second,  in  the  background-dominated  regime, 
the  same  scaling  law  is  moderated  to  a  pure  quadratic  form,  with  a  slow  transition  from 
the  logarithmically  modified  quartic  power  law  dependence  as  the  signal  strength  increases 
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Figure  11:  Log-log  plots  of  the  minimum  source  strength,  Kmin,  vs.  the  OSR  ratio,  w/d, 
for  the  three  different  background  levels. 


relative  to  the  background  level.  These  scaling  laws  are  seen  in  Fig.  11  where  we  plot  the 
minimum  source  strength,  Kmin  (in  photon  number  units),  versus  the  inverse  ratio  of  the 
pair  spacing,  d,  to  the  diffractive  resolution  spacing,  w,  on  a  log-log  plot.  The  slow  change 
of  the  slope  of  the  log^Q  Kmin  vs.  \ogiQ{d/w)  from  4  to  2  with  increasing  background  levels 
is  quite  evident  in  this  plot. 

More  details  of  this  work  may  be  found  in  Ref.  [6]  where  we  present  detailed  derivations 
of  the  various  scaling  laws  under  varying  conditions  of  signal  to  background  ratios.  While 
the  pair  OSR  problem  was  analyzed  here  in  the  plane  of  best  focus  for  a  Gaussian  PSF  based 
imager,  the  analysis  can  be  readily  generalized  to  other  PSFs  and  to  full  3D  pair  OSR  as 
well.  This  is  currently  being  studied  by  the  UNM  group. 
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